ABSTRACT

In the preparation of a thesaurus for the Language Information Network
and Clearinghouse System (LINCS), a number of already existing indexing
tools were consulted. Many of these provide important secondary sources
of terminology and of term relations, in addition to the primary sources
available in original texts. Others are models of thesaurus
construction, and some are indexing tools with which the LINCS thesaurus
could interface. The discussion of indexing tools, which follows, first
treats briefly two models of thesaurus construction (Project LEX and
Roget's) which are relevant to LINCS, then discusses in some detail
secondary terminology sources, their nature, and the way in which they
can be used. Finally, the report considers the question of interface in
view of the nature of several indexing tools in contact with LINCS. An
appendix lists core terms in the LINCS collection of language oriented
terms, as well as their intersection with another source.
(Author/Aa)
PREFACE

In the preparation of a thesaurus for the Language Information Network
and Clearinghouse System (LINCS), a number of already existing indexing
tools were consulted. Many of these provide important secondary sources
of terminology and of term relations, in addition to the primary sources
available in original texts. Others are models of thesaurus
construction, and some are indexing tools with which the LINCS thesaurus
will probably interface in the future.

The discussion of indexing tools, which follows, first treats briefly
two models of thesaurus construction which are relevant to LINCS, then
discusses in some detail secondary terminology sources, their nature,
and the way in which they can be used. Finally, it considers the
question of interface in view of the nature of several indexing tools
which will probably be in contact with LINCS.
1. Thesaurus Models

1.1. Project LEX

Since LINCS has decided to adapt the COSATI [18] guidelines for its own
thesaurus construction, and since those guidelines are most fully
exemplified in the Project LEX Thesaurus [20], the latter is a practical
model for LINCS and useful in understanding the guidelines. For example,
the guidelines state that one should be able to say of a narrower term
(NT) that it "is a" broader term (BT), e.g., a tree "is a" plant.
However, the Project LEX thesaurus shows that this may be interpreted
rather broadly, in that it gives linguistics as a BT to phonology, which
in a strict interpretation would mean that phonology "is a" linguistics,
an assertion which is not acceptable at face value.

On the whole, Project LEX indicates feasible limits to the
interpretation of the COSATI guidelines by illustration. It is a
valuable reference tool for the job of thesaurus construction. Other
recent thesauri based on the guidelines may also serve this function.

1.2. Roget's Thesaurus

Thesauri can be characterized as informative or merely correct. That is,
in looking up a term X, the user (a) may find only the instruction "USE
term Y", which gives him no way of telling whether term Y is what he
wants, or (b) he may find the instruction "USE term Y" plus a structured
list of terms related to Y so that he knows whether he wants to search
for Y or for some other term, perhaps one related to Y. In case (b) the
user does not have to riffle through the thesaurus a number of times
before he can initiate his search.
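
The contrast between approaches (a) and (b) can be sketched in a few
lines of Python. This is only an illustration under stated assumptions:
the two small tables and both function names are hypothetical, and the
entry "sound system USE phonology" is invented for the example rather
than taken from any thesaurus discussed here.

```python
# Hypothetical miniature thesaurus; all entries are invented for illustration.
USE = {"sound system": "phonology"}
RELATED = {"phonology": {"BT": ["linguistics"], "RT": ["phonetics"]}}


def lookup_merely_correct(term):
    """Approach (a): return only the USE instruction, if there is one."""
    preferred = USE.get(term)
    return f"USE {preferred}" if preferred else None


def lookup_informative(term):
    """Approach (b): return the USE instruction plus a structured list
    of the terms related to the preferred term."""
    preferred = USE.get(term, term)
    lines = []
    if preferred != term:
        lines.append(f"USE {preferred}")
    # Relations are listed in alphabetical order of their codes (BT, NT, RT).
    for relation, terms in sorted(RELATED.get(preferred, {}).items()):
        for related in terms:
            lines.append(f"  {relation} {related}")
    return lines
```

With approach (b) the user sees at once that the preferred term has a
broader and a related term, and can decide where to search without a
second lookup.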

The most famous of all thesauri is Roget's Thesaurus [49], first
published in 1852 and still widely used. It can be of some help to LINCS
in formulating an approach to the user of the thesaurus. The dictionary
form of Roget's Thesaurus, as contrasted with the original, gives
maximum help and information to the user. That is, at each entry it
gives rather complete information as well as instructions as to where to
find more.

The fact that Roget's Thesaurus continues to be sold and that it uses
approach (b) rather than approach (a) is an indication that experience
has shown the value of being helpful to the user.



2. Terminology Sources

In discussing the sources of terminology, attention is focused below
on the practical questions of when, how, and for what purpose each
type or individual item can be used. Thus the sources are placed in
four major divisions: (1) sources for initial construction, (2)
sources for expansion, (3) sources for completion, (4) sources for
long-range maintenance and updating.

The basic method of using indexing tools or other terminology sources
is to incorporate a part or all of a given source into the thesaurus
while deleting superfluous main entries through the USE notation, e.g.
five and ten cent store USE dime store.

To incorporate simply means to merge the list of desired terms with
the list of terms already in the thesaurus at that point in its
development, tying in the given hierarchical relations (broader terms,
narrower terms, related terms in a COSATI thesaurus) and making all
apparent connections with items already in the thesaurus. For this
merging, an alphabetical list of all terms in the thesaurus is extremely
helpful. Use of the term 'merge' does not imply direct computer
intervention. Rather, this is the normal human method for developing the
list of thesaurus terms and relations. It involves making judgments
about relative positions in a hierarchy, about the relative correctness
of differing postulated relationships, and about the possibilities of
coexistent but different sets of relationships.
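
Although the merging described above is a human task, its bookkeeping
can be pictured with a short Python sketch. Everything here is an
assumption made for illustration: the function names, the data layout,
and the term "variety store" are hypothetical, and the decisions about
which entries are superfluous are supplied by the thesaurus builder, not
computed.

```python
def incorporate(thesaurus, source_terms, use_decisions):
    """Merge a source's term list into the thesaurus.

    thesaurus: dict mapping each main entry to a set of (relation, term) pairs.
    source_terms: iterable of terms taken from the source being incorporated.
    use_decisions: dict mapping a superfluous term to its preferred term,
        as judged by the (human) thesaurus builder.
    Returns the dict of USE references created for the superfluous terms.
    """
    use_refs = {}
    for term in source_terms:
        preferred = use_decisions.get(term)
        if preferred is not None:
            use_refs[term] = preferred            # "term USE preferred"
            thesaurus.setdefault(preferred, set())
        else:
            thesaurus.setdefault(term, set())     # existing relations are kept
    return use_refs


def alphabetical_list(thesaurus, use_refs):
    """Alphabetical list of main entries and USE references."""
    lines = list(thesaurus)
    lines += [f"{term} USE {preferred}" for term, preferred in use_refs.items()]
    return sorted(lines)
```

For example, incorporating the terms "five and ten cent store" and
"variety store" into a thesaurus that already holds "dime store", with
the first marked as superfluous, yields one USE reference and one new
main entry.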

These judgments can, to a large extent, be based on the information
available in the indexing tools. Brief outlines and classification
schemes are likely to be in general agreement with each other about
the structure of the overall hierarchy even though each is likely to
have a few terms not shared by other sources. Longer sources such as
dictionaries and textbooks can be used to resolve differences.

The task of incorporation can be computer-aided. Provided a suitable
program is available, the computer can generate reciprocals, thus
enabling the thesaurus builder to be logically consistent. If he
introduces a term X and asserts that it is related to some other
term Y, the generation of reciprocals ensures that term Y will have
as one related term (RT) the term X. Similarly, if Y is asserted to be
a broader term (BT) to X, generation of reciprocals will make the
statement that X is a narrower term (NT) to Y.
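
The generation of reciprocals just described might look roughly as
follows in Python. This is a minimal sketch, not the program used or
planned for LINCS: the data layout (a dict of sets of relation/term
pairs) and the function names are assumptions made for the example.

```python
# Each relation code and the code of its reciprocal.
RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT"}


def add_relation(thesaurus, term, relation, other):
    """Record 'term <relation> other' and generate the reciprocal entry."""
    thesaurus.setdefault(term, set()).add((relation, other))
    thesaurus.setdefault(other, set()).add((RECIPROCAL[relation], term))


def missing_reciprocals(thesaurus):
    """List (term, relation, other) entries whose reciprocal is absent,
    for example after entries have been added by hand."""
    return sorted(
        (term, relation, other)
        for term, relations in thesaurus.items()
        for relation, other in relations
        if (RECIPROCAL[relation], term) not in thesaurus.get(other, set())
    )
```

Entering "phonology BT linguistics" through add_relation automatically
files "NT phonology" under linguistics, while missing_reciprocals can be
run over an existing file to report relations entered in one direction
only.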

2.1. Sources for Initial Construction 



In the initial construction of a thesaurus, it is obviously necessary
to establish a first list of terms to which additions can then be made.
Sources should be chosen for initial use on the basis of brevity,
generality, classificatory nature, and orientation towards central
areas of concern in the language sciences. Brevity and generality are
not independent of each other. If fewer terms are used to describe
an area, they will probably be more general terms. Sources which have
classifications help provide some of the hierarchical information needed
in a thesaurus. And finally, to be a thesaurus in the area of the
language sciences, the thesaurus must be confined to that area when
incorporating terms.

For the experimental pilot thesaurus [43] constructed for LINCS in the
spring of 195 a list of language specialties developed at the Center for
Applied Linguistics was used as the basic list. Then three other rather
general sources were incorporated: (a) terms from the Project LEX
Thesaurus [20] under the headings of language and linguistics, (b)
language-related terms from the Human Relations Area Files Outline of
Cultural Materials [34], sections 19 (language) and 20 (communications),
and (c) terms from the ETIC classification [8]. The first two of these
sources contributed very few further terms. Language is not their
primary orientation, so the lists of terms were rather short and
general, overlapping greatly with the basic list. The third contributed
a good many terms, probably for two reasons: it is of British origin and
thus there are national differences in terminology, and also, since its
primary concern is language teaching, the number of terms in the
language area is great.

Further sources for initial construction are here grouped first by
orientation and second by length: (1) sources devoted wholly to
linguistic subjects (Section 2.1.1), and (2) sources with
language-related sections (Section 2.1.2). In each group, the shorter
sources can be incorporated into, and probably will add few terms to, a
basic list, while the longer sources may best be incorporated
selectively.

2.1.1. Wholly Linguistic Sources

(a) Shorter Sources (all are tables of contents)

1. Linguistic Bibliography [18].

2. ERIC Clearinghouse for Linguistics, 1966
Selected Bibliography [22].

3. Rice and Guss, Information Sources in
Linguistics [48].

4. Allen, Linguistics and English Linguistics [1].

5. Textbooks and other major books in the language
sciences, e.g. Saussure [51], Bloomfield [7],
Hockett [32], Gleason [25], Sapir [50], Chomsky

(b) Longer Sources (all are classification schemes)

1. Trager, "A Bibliographical Classification System
for Linguistics and Languages," 1945 [54].

2. Classification of the Summer Institute of
Linguistics Bibliography, 1967 [53].

3. Language Research in Progress thesaurus
[11].

4. ETIC/CILT Classification [8].

5. Thesaurus of Bilingualism Descriptors [35].

6. Language and Language Behavior Abstracts [13].

The Linguistic Bibliography [18] is generally regarded as the main
systematic library tool available to linguists. In spite of its non-
[...] provides a first step towards [...] than .0 terms, excluding
language names, and these terms are not extensively interrelated.
Language names are the first basis of division; then under each
language, linguistic terms are used. The field is essentially divided
into the following segments:

1. Bibliography and general (including also linguistic
theory and method, typology, terminology, history of
linguistics).

2. Phonetics and phonology (phonemics, historical
phonetics, descriptive and experimental phonetics).

3. Grammar (morphology and syntax).

4. History of language.

5. Linguistic geography and dialectology.

6. Vocabulary (lexicography, etymology, semantics).

7. Script, orthography.

8. Stylistics.

9. Prosody, metre, versification.

10. Translation (general, mechanical translation).

11. Mathematical linguistics.

12. Philosophy, psychology, and sociology of language.

13. Miscellaneous (bilingualism, child language, aphasia,
speech disorders, auxiliary languages).

14. Language teaching.

15. Onomastics.

Insofar as linguists have trained themselves to look at language in this
way, these terms must be included as entry points at least.

The ERIC Clearinghouse for Linguistics table of contents to the 1966
Selected Bibliography [22] has somewhat more importance than simply
another table of contents in that it is a part of a nationwide
information system and thus in the future probably will be the most
generally available source as well as being organizationally related to
LINCS.

The Rice and Guss handbook [48] organizes the field in terms of
fields within linguistics and interdisciplinary fields. It is brief
and for some linguists has served as a very important reference work.
Allen's bibliography [1] is also very brief. Though it concentrates
on English linguistics, this parallels the field of linguistics,
especially in recent years.

Major books in linguistics vary considerably in how much the actual
table of contents reflects the terminology of linguistics. Textbooks in
particular, however, devote quite a bit of attention to terms. For
example, each chapter of Hockett's introductory text [32] closes with a
review of important terminology. In this same general area fall the
catalogs published by companies such as Mouton, who are commercially
concerned with directing users to books they could want.

Trager's classification is the most thorough yet done for linguistics.
It appeared in articles in the journal Studies in Linguistics in 1945
and 1957. In his 1945 article, Trager said, "The arrangement of the
subject matter of linguistics follows the general outlines used by all
linguists. Detailed subdivisions have been worked out by me, using such
library classifications as Library of Congress and Dewey's for
suggestions..." [54, p. 55]. This source ought to be incorporated
regardless of its length, since it represents a definitely linguistic
approach. Trager's main divisions of "linguistic form" as he terms it
are:

1. Writing

2. Phonology

3. Morphology

4. Syntax

5. Lexicology

6. Etymology

7. Semantics

8. Dialectology

9. Philology.

The classification scheme for work done by the Summer Institute of
Linguistics [53] is a valuable information source on exotic languages.
Also it represents to some extent one of the major theories of
linguistic description, namely tagmemics.

Language Research in Progress [...] documented research in linguistics
and related fields. Its terms reflect the interest of researchers whose
work is listed. The most elaborated categories in terms of (a) having
subcategories and (b) being cross-referenced are the following: [...]
learning, phonetics, psycholinguistics, meaning, sociolinguistics,
memory, word association, auditory communication, speech pathology, and
[...] languages. These seem to focus somewhat more on psycholinguistics
and speech than on the core of linguistics as expressed in the
classifications of the Linguistic Bibliography and of Trager.

The ETIC/CILT Classification [8] was used in the construction of an
experimental pilot thesaurus, to which, as the fourth source
incorporated, it added about 60 terms. The British Council maintains
the English-Teaching Information Centre (ETIC) as a study centre and
clearinghouse for information on all aspects of teaching English as
[...]. The Centre for Information on Language Teaching (CILT) performs
similar functions for all aspects of modern languages and their
teaching. Together these centres maintain a language-[...] which
includes a sizable number of purely linguistic documents.

The International Centre for Research on Bilingualism in Quebec is
preparing a Thesaurus of Bilingualism Descriptors [36]. It serves as
a model of treatment of language vocabulary.

Language and Language Behavior Abstracts [13] is a fairly new
publication. A look at its section headings indicates that its areas of
interest are similar to those of Language Research in Progress. However,
its document sources may be rather different. A thesaurus of 4,000 terms
is forthcoming from the compilers of Language and Language Behavior
Abstracts.

2.1.2. Sources with Language-Related Sections 

a. Library classifications

1. Dewey Decimal System [21].

2. Library of Congress [58].

3. Universal Decimal Classification [39, 60].

b. Other

1. Human Relations Area Files [34].

2. Bulletin Signalétique [10].

3. MLA Bibliography [45].

4. Social Science Documentation [37, 38].

5. CAL Collection of Language-Oriented Terms [41].

[...] terminology [...] are of little use, in that they have very few
language-related terms, and those are extremely general and scattered
about. The sources above have fairly well defined sections dealing with
language, with the exception of Social Science Documentation [37, 38]
[...] extremely general terms in either the sociology or the
anthropology section.

[...] is the briefest, with about 25 terms ([...] of the [...] ed.). It
can easily be incorporated and should be, because of its widespread use
and availability.

[...] considerable similarity to Trager's work [...].

The Center for Applied Linguistics is working on modifying the Universal
Decimal Classification [...] the language sciences. This should of
course be incorporated into LINCS, hopefully in a form [...]
International Federation for Documentation (FID). Since the UDC is an
internationally accepted [...]

The Human Relations Area Files Outline of Cultural Materials [34] was
used in the experimental pilot thesaurus because anthropology is
generally felt to be the disciplinary context within which language most
fully belongs. Because of the discursive nature of the outline, the
number of specific terms is small. Language seems to be covered under
two heads, language and communication. The final thesaurus can easily
include these terms, increasing the usability of LINCS by
anthropologists.

Section 21 of the Bulletin Signalétique [10] includes "sciences du
langage", for which the table of contents is another brief
classification (in French) of language sciences. This can easily be
incorporated, in either language. Probably no entries in French should
be made until use of that language and related policies have been
introduced.

The portion of the MLA Bibliography [45] dealing with linguistics can
easily be incorporated also. In general, after the first two or three
brief classification schemes have been incorporated, it is to be
expected that each additional classification will add only a few terms.
However, in each of the general classifications, each term is considered
relatively important and hence ought to be integrated in a completely
general information system.

The CAL collection of language-oriented terms [41] was put together by
Kathleen P. Lewis for the LINCS project from a group of thesauri with
language-related interests. This list, of approximately 2,000 terms,
is described more fully in the last section of this paper. Suffice it
to say here that the 400 terms in that list which are most centrally
linguistic could be incorporated into the thesaurus either at this stage
or the next.

In a four-stage approach to thesaurus construction, the first stage
calls for tentatively incorporating all usable purportedly general
classifications of linguistics or linguistic subfields and then
subsequently pruning out unusable items and correcting inconsistencies
or resolving them by scope notes and the like. The vocabulary of the
thesaurus at this stage is quite general and of the sort which will
almost certainly be included in the final product.

2.2. Sources for Expansion 

The goal of the second stage is to fill out the body of the thesaurus, 
so that it fairly represents the field of the language sciences. 

It is important to recognize that in building a thesaurus, we are
concerned with two types of terms: (a) ENTRY POINTS, which should be
provided both for the specialist and for the non-specialist, and (b)
CORE TERMS, which should be correctly related to each other, to the
literature to be searched, and to the entry points. The core terms may
well be the primary entry points for specialists and hence can be
considered a subset of the entry points. Further, to a large extent
these terms are the ones which are entered first and about which the
thesaurus is built.

Selection of entry points should be based primarily on the commonness of
the term. Hence, for example, the term 'primitive language' might be an
entry point, with perhaps 'USE exotic language' and 'RT
ethnolinguistics' to guide the user further. Our sources for entry
points must reflect the terms people think of when they have questions
(information needs) about language. Thus, widely used traditional and
popular terms must be found and included.



Core term relations will evolve as the thesaurus maker proposes
hierarchical relationships, tests them, and incorporates criticisms
offered by reviewers at every stage. Among the relevant documentary
sources here are reliable dictionaries and articles about terminology.
As more terms are added, it becomes more and more important to make all
appropriate connections to guide the user successfully from the terms in
which he conceives of his information need to the terms in which the
need is filled in a document.

The following sources will be important in expanding the thesaurus
vocabulary:

1. H. A. Gleason, Jr., A Dictionary of Linguistics
Terminology [25].

2. Eric P. Hamp, A Glossary of American Technical
Linguistic Usage: 1925-50 [30].

3. Mario Pei, Glossary of Linguistic Terminology, 1966 [47].

4. Webster's Third International Dictionary [64].

5. Indexes to major works in various language science
specialities.

6. Work on language learning and teaching.

When complete, Gleason's dictionary [25] will be very nearly definitive
and comprehensive. The relationships of troublesome terms can certainly
be clarified by referring to Gleason's definitions.

Hamp's Glossary [30] has a very useful introduction in which he both
documents the terms found in the glossary and indicates where terms can
be found to take the user past 1950. His sources for both included and
newer terms are textbooks and important books or papers. Consulting this
book is a must for expanding the LINCS thesaurus. Among other virtues,
it clearly describes its principles of selection and method of
construction. Furthermore, it has a Russian translation.

Because of its relative brevity and general accessibility, Pei's revised
glossary [47] should be a major source of terms for the LINCS thesaurus.
It would be good if the user of LINCS, finding a term in Pei, could use
that term in approaching the LINCS thesaurus. Thus, it is not so much
the entries in Pei which should be incorporated, but the terms he uses
in definitions, what the user learns by going to Pei. William Gage's
review of the glossary can be of assistance in making best use of this
source [24]. The importance of this source lies in the fact that most
libraries, academic and small-town, have Pei on their shelves, so many
entry points are generated from this glossary.

Webster's Third [64] asked linguists for help in selecting and defining
linguistic terms. It is both authoritative and widely used. Again, as
with the previous sources, each dictionary entry is likely to reveal
rather explicitly the three COSATI relationships of NT, BT, and RT
simply in the logic of the definition. Thus, they are not only sources
of terms, but more importantly, sources of information on some more or
less widely accepted views of their logical or pragmatic relationships.

In referring to indexes to major works in the language sciences we have
changed our emphasis from classification, as in a table of contents, to
simple listing of important terms. It may be noted that if the thesaurus
is kept alphabetical, comparisons with indexes (also alphabetical) will
be work of an essentially clerical nature. Other sources in the
literature can be used, such as various specific bibliographies,
discussions of terminology, etc.
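
The clerical comparison of an alphabetical thesaurus with an
alphabetical index can be pictured as walking the two lists in step.
The Python sketch below is illustrative only: the function name is
hypothetical, and it assumes both lists are already sorted and spelled
in the same way (same case, same word order).

```python
def index_terms_missing(thesaurus_terms, index_terms):
    """Walk two alphabetical lists in step and return the index terms
    that do not yet appear in the thesaurus."""
    missing = []
    i = 0
    for term in index_terms:
        # Advance through the thesaurus until we reach or pass this term.
        while i < len(thesaurus_terms) and thesaurus_terms[i] < term:
            i += 1
        if i == len(thesaurus_terms) or thesaurus_terms[i] != term:
            missing.append(term)
    return missing
```

Each list is read only once from top to bottom, which is what makes the
comparison routine; deciding whether a missing term deserves an entry
remains a matter of judgment.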

Books dealing with language learning and language teaching will include
terms of wide user interest. To be noted in this regard are Blair [6],
Narvoson [46], Steible [52], and Walsh [63]. Each of these is designed
to help the student of language or literature to deal with the
linguistic terms which he encounters.



2.3. Sources for Completion 

The third stage of thesaurus construction is concerned with making the
thesaurus complete. Sources used at this stage fill in gaps in new
terminology, popular terms, and new areas, and provide multilingual
equivalents if the thesaurus is to be made multilingual. Also at this
stage it is important to make sure that the terminology in the thesaurus
is [...] arranged. For example, terms from various theoretical [...] the
same ground. It would be misleading to have one school represented only
by terms in phonology and another only by terms in syntax.

The sources which could be of use in putting the thesaurus together are
[...] completeness and balance and then to correct deficiencies. The
following will be discussed:

1. C. F. Voegelin, "A Sample of Technical Terms in
Linguistics," 1948 [62].

2. Articles about terminology in Studies in Linguistics
by Hall [28] and Hockett [29] and similar articles.

3. Books dealing with specific theoretical schools of
linguistics, e.g. Vachek and Dubsky, Dictionnaire de
Linguistique de l'Ecole de Prague [61].

4. Reference works in language-related fields [42].

5. Experiments with the terminology of the language
sciences.

6. Multilingual dictionaries, e.g. Marouzeau, Lexique de
la Terminologie Linguistique [44], Axmanova, Slovar'
Lingvisticeskix Terminov [5].

7. Monolingual non-English dictionaries, e.g. Felice [23],
Godel [27].

Voegelin's article includes a large number of linguistic terms drawn
from articles by Sturtevant, Sapir, Bloomfield, Harris, and Trubetzkoy
in the areas of structural linguistics, historical linguistics, dialect
geography and linguistic typology. Unless it has been clearly
incorporated in one of the other sources, such as Hamp or Gleason, each
term must be checked before the thesaurus can be called complete.

The articles by Hall and Hockett in Studies in Linguistics deal
respectively with the basic terms in linguistics and with terms in
historical linguistics.

Although SIL has devoted considerable space to questions of terminology,
it is not the only journal to consider terminology. The bibliographical
listing prepared by Kathleen P. Lewis for LINCS, "Indexing tools and
terminology sources in the language sciences," [42] includes a number of
articles in various languages on linguistic terminology.

To round out the picture of various theoretical viewpoints given in
LINCS, it will be necessary to consult specific sources for each school.
Worth mentioning, in addition to the Dictionnaire of Vachek and Dubsky,
are Hjelmslev's Prolegomena to a Theory of Language [31], which devotes
a chapter to definitions, and books by people in the various newer
schools such as Chomsky, Pike, Lamb, and Halliday.

Interdisciplinary areas also need special attention to make sure that
coverage is not spotty. Reference books in the Lewis bibliography [42]
should be consulted.

For purposes of information retrieval, it is normal to construct a
thesaurus from index terms which were selected from a real corpus of
documents. It would be wise not to neglect the kind of experimentation
that often accompanies the more empirical approach to thesaurus
construction. Thus non-documentary sources of information about the
terminology of the language sciences include experiments in free
indexing of linguistic documents (treating all index terms as candidates
for inclusion in the thesaurus), experiments in retrieval of information
for users (all terms used by users to be included in the thesaurus as
needed entry points), statistical studies of important language science
literature (where the statistics could be merely tabulation of terms
used), and questionnaires about terminology.
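
The "mere tabulation of terms used" mentioned above can be sketched as
a simple count. The following Python fragment is an illustration under
assumptions: the function name and data layout are hypothetical, and
terms are compared after lower-casing only, with no attempt at merging
spelling variants or synonyms.

```python
from collections import Counter


def tabulate_candidates(indexed_documents, thesaurus_terms):
    """Count how many documents each free index term was assigned to and
    return the terms not yet in the thesaurus, most frequent first.

    indexed_documents: list of lists, one list of index terms per document.
    thesaurus_terms: iterable of terms already in the thesaurus.
    """
    counts = Counter(
        term
        for terms in indexed_documents
        for term in {t.lower() for t in terms}   # count once per document
    )
    known = {t.lower() for t in thesaurus_terms}
    candidates = [(term, n) for term, n in counts.items() if term not in known]
    # Sort by descending frequency, then alphabetically for stable output.
    return sorted(candidates, key=lambda pair: (-pair[1], pair[0]))
```

The resulting list only ranks candidates; whether a frequent term
becomes a core term, an entry point, or a USE reference is still decided
by the thesaurus builder.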

Assuming that the thesaurus is to be multilingual, this is the point at
which conversions can be made. It may be a programmable operation to
look up all the English terms in the thesaurus in the multilingual
dictionaries and get equivalents. Note that we either assume that
English is the pivot language in which both indexing and searching are
done or that some multilingual arrangements can be made about
processing. One possibility is to treat as English words all non-English
words which have no popular English equivalents. The dictionaries listed
have several languages. Marouzeau has French, English, German, Italian,
and Russian (Russian only in the Russian translation by Andreev);
Axmanova has Russian, English, French, German, and Spanish. Other
smaller scale multi- or bilingual dictionaries are cited in the Lewis
bibliography [42].

Monolingual non-English dictionaries may be the main source of
information about terminologies used in non-English theoretical schools.
Godel deals with Saussurian terminology and Felice with the school of
Ascoli.
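
The "programmable operation" of looking up English terms in multilingual
dictionaries, with English as the pivot language, might be sketched as
follows. This is an assumption-laden illustration: the function name and
data layout are hypothetical, the dictionaries are represented as plain
Python dicts keyed by English term, and any sample entries a reader
supplies are their own, not quotations from Marouzeau or Axmanova.

```python
def add_equivalents(english_terms, dictionaries):
    """Collect foreign-language equivalents for each English term.

    dictionaries: dict mapping a dictionary name to
        {english_term: {language: equivalent}}.
    Returns (equivalents, unmatched), where unmatched lists the English
    terms found in none of the dictionaries.
    """
    equivalents, unmatched = {}, []
    for term in english_terms:
        found = {}
        for entries in dictionaries.values():
            for language, word in entries.get(term, {}).items():
                # If two dictionaries disagree, the first one consulted wins.
                found.setdefault(language, word)
        if found:
            equivalents[term] = found
        else:
            unmatched.append(term)
    return equivalents, unmatched
```

The list of unmatched terms is as useful as the equivalents themselves,
since it shows where a smaller bilingual dictionary or a human
translator would have to be consulted.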



2.4. Sources for Long-Range Maintenance and Updating

Only a prophet could really say what sources will be useful in the
future. However, certain sources can be expected to be updated
periodically because of their function. These sources are brought
together here. All of the named sources were also mentioned above. The
sources are placed in three groups: (1) general, (2) specific, (3)
direct. The direct sources will have least lag, because they represent
the document or user very directly. The specific sources are next most
current, because they deal with circumscribed areas where there is some
concern about keeping abreast of things among people working in that
area. The general sources have most lag, because language-related
materials are only a part of their responsibility or because they are
responsible for very nearly the whole field of language sciences.

a. General Sources:

1. Bulletin Signalétique [10],

2. Linguistic Bibliography [15],

3. ERIC Clearinghouse for Linguistics [22],

4. MLA Bibliography [45],

5. Project LEX Thesaurus [20].

b. Specific Sources:

1. ETIC/CILT classification [8],

2. Summer Institute of Linguistics Bibliography [53],

3. Language Research in Progress [11],

4. Language and Language Behavior Abstracts [13] and
its thesaurus.

c. Direct Sources:

1. Experiments in free indexing, information
retrieval, word statistics.

2. Indexes, tables of contents and reviews of new
books, issues of journals, review articles,
bibliographies on specific topics, new
popularizations.



3. Interfaces with LINCS

When we speak of interfaces, we are of course primarily concerned with
systems which will be functioning in some way when LINCS is in full
operation. Of course many information systems are concerned with
linguistics in a peripheral way. It is highly unlikely that LINCS will
interact with them. A user who would come to LINCS would not also go,
for example, to Chemical Abstracts for the same information. The systems
most likely to interface with LINCS are the systems from which Lewis
extracted terms to be examined in the initial stages of the LINCS
project [41], or systems very similar to them. Her sources include the
following:

1. dsh Abstracts, published by the American
Speech and Hearing Association [3], abbreviated
dsh.

2. Language Research in Progress thesaurus [11],
abbreviated lrp. As mentioned above, this is
essentially a linguistic source with leanings
towards psycholinguistics and speech.

3. Western Reserve University Education Thesaurus
[12], abbreviated wru.

4. National Library of Medicine [16], abbreviated
nlm.

5. Defense Documentation Center [19], abbreviated
ddc.

6. Project LEX, Department of Defense [20],
abbreviated lex.

7. Human Relations Area Files, Outline of Cultural
Materials [34], abbreviated hrf.

8. The Johns Hopkins University Information Center
for Hearing, Speech and Disorders of Human
Communication [35], abbreviated jhu.

9. Medical Subject Headings [56], abbreviated msh.

10. Thesaurus of ERIC Descriptors [57], abbreviated ere.

Approximately 2,000 of the Lewis entries were listed on a computer
printout. To get some idea of the relationship of these entries to probable
terms of the LINCS thesaurus, the printout was examined for terms which
reasonably could be said to be relevant to linguistics, the core field
of the language sciences. The list of core terms and the sources from
which they were taken is given in the Appendix. This represents a fair
proportion of the total list of language-related terms extracted from
the interface sources, 403 terms appearing a total of 635 times; i.e.,
635 of the 2,156 entries in the list fall within the domain of linguistics
as characterized by this investigator.
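
The proportions implied by these figures can be checked with a few lines
of Python. This is only an arithmetic sketch; the variable names are
illustrative, and the reading of "635 times" as term-source occurrences
is an assumption based on the Appendix description.

```python
# Figures quoted in the paragraph above.
distinct_terms = 403   # different core terms
occurrences = 635      # times those terms appear across the sources
total_entries = 2156   # all language-related entries on the printout

# Share of the printout that falls within linguistics.
share_of_entries = occurrences / total_entries

# Average number of appearances per distinct core term
# (assumes each appearance is one term in one source).
mean_appearances_per_term = occurrences / distinct_terms

print(f"{share_of_entries:.1%} of the entries are core linguistics terms")
print(f"about {mean_appearances_per_term:.2f} appearances per distinct term")
```

On these figures, somewhat under a third of the printout is core
linguistic terminology, and a typical core term turns up in more than
one source.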

What does this mean in terms of interface operations? It means at least
that users who go to these sources for information in these areas could
better profit from access through LINCS to the literature of linguistics.
The sources vary considerably in their area of concentration in language.
Table 1 shows how this works out numerically. For each of the following
general areas, the number of terms in that area in each of the sources
is given: phonology, grammar, semantics, general, and language names.
Many terms occur in more than one source, so the total of terms by sources
in a given area is larger than the actual total number of different terms
in that area. The highest number in each column and each row is circled
to indicate concentrations of terminology. It must be pointed out that
the sources differ considerably in length, the most obvious difference
being that the Human Relations Area Files source suggests terms for use
by people rather than explicit terms for use by a computer.
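
The kind of tally described here, and the reason the sum by sources
exceeds the count of different terms, can be sketched in Python. The
records below are invented for illustration and are not taken from the
Appendix; the function name tabulate is likewise hypothetical, and the
sketch assumes no (term, source) pair is listed twice.

```python
from collections import defaultdict

# Hypothetical (term, area, source) records, for illustration only.
records = [
    ("phoneme", "phonology", "jhu"),
    ("phoneme", "phonology", "lrp"),
    ("syntax", "grammar", "lrp"),
    ("syntax", "grammar", "ere"),
    ("meaning", "semantics", "lrp"),
    ("swahili", "language names", "ere"),
]

def tabulate(records):
    """Count terms per area and source, and distinct terms per area."""
    counts = defaultdict(lambda: defaultdict(int))
    terms_by_area = defaultdict(set)
    for term, area, source in records:
        counts[area][source] += 1        # one cell of the table
        terms_by_area[area].add(term)    # a set ignores repeats across sources
    distinct = {area: len(terms) for area, terms in terms_by_area.items()}
    return counts, distinct

counts, distinct = tabulate(records)
for area, by_source in counts.items():
    total_by_source = sum(by_source.values())
    print(area, dict(by_source),
          "sum by source:", total_by_source,
          "different terms:", distinct[area])
```

In the sample, "phoneme" is counted once for each of two sources, so
phonology shows a sum of 2 by source but only 1 different term.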

The sources which show the most overlap with linguistics are Language
Research in Progress, the Johns Hopkins thesaurus, and the ERIC thesaurus.
ERIC is heavy on language names; Johns Hopkins has nearly all the phonology
terms on the list; and Language Research in Progress is high in all
categories except language names. The Appendix shows exactly how this works
out, and what each of the categories includes.

Table 1. Number of Terms in Each General Linguistic Area in Each
of Ten Language-Related Sources *

[Table body not recoverable from the source scan.
Areas: Phonology, Grammar, Semantics, General, Language Names, Total.
Sources: dsh, lrp, wru, nlm, ddc, lex, hrf, jhu, msh, ere.
Overall: 403 different terms, 635 tokens.]

* The highest number in each column and each row is circled to indicate
concentrations of terminology.

It seems appropriate to include all of these central terms in the
thesaurus in view of the fact that they can serve not only as useful
terms in the LINCS thesaurus, but also as links between the thesaurus
and these other indexing tools. Conceivably each term in LINCS that
does appear in the same sense in an interfacing retrieval tool could
be coded to indicate the tool in which it appears.
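
One way such coding might look is sketched below in Python. The report
does not specify a record format, so the structure, the sample terms,
and the helper names tools_for and shared_terms are all assumptions
made for illustration; only the source abbreviations come from the text.

```python
# Hypothetical thesaurus entries: each term carries the codes of the
# interfacing tools in which it appears in the same sense.
thesaurus = {
    "phoneme": {"jhu", "lrp"},
    "syntax": {"lrp", "ere"},
    "glottochronology": set(),   # a term with no interfacing tool
}

def tools_for(term):
    """Return the codes of interfacing tools that also carry the term."""
    return sorted(thesaurus.get(term, set()))

def shared_terms(tool):
    """Return the terms that link this thesaurus to the given tool."""
    return sorted(t for t, codes in thesaurus.items() if tool in codes)

print(tools_for("phoneme"))    # tools reachable from one term
print(shared_terms("lrp"))     # terms shared with one tool
```

Storing the codes on the term keeps both lookups simple: from a term to
the tools that share it, and from a tool to its linking terms.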

The most important service to the user may be to help him know where
to look. If a user comes to the "wrong" system, that system will be
most useful if it can tell him what other system may be able to help
him. One possible way to do this could be for each system in a related
group of systems to have a master thesaurus containing all terms in all
connected systems, as of some recent date. For systems which are most
closely related, there might be a computerized link leading from one
into the next via the terms shared by the systems. To the extent that
systems share not only terms but source documents, the same documents
may be reached by different routes in different systems. If the whole
of Language Research in Progress and its source documents are incorporated
into LINCS, then LINCS will have made a commitment to psycholinguistics
which will put it somewhat deeper in contact with some of the other
sources. This might require an extensive study of overlap with
Psychological Abstracts [2], a task which has not yet been done. Similarly,
connections with information science [17] and computer science [4]
may need to be carefully explored.
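
The master thesaurus idea can be illustrated with a short Python sketch.
Nothing here describes an existing LINCS facility; the function names
build_master and refer and the sample term sets are hypothetical, and
the system labels are borrowed from the text only for readability.

```python
# Hypothetical term lists for a small group of connected systems.
systems = {
    "LINCS": {"phoneme", "syntax"},
    "jhu": {"phoneme", "aphasia"},
    "nlm": {"aphasia"},
}

def build_master(systems):
    """Merge all term lists into one index: term -> systems holding it."""
    master = {}
    for name, terms in systems.items():
        for term in terms:
            master.setdefault(term, set()).add(name)
    return master

def refer(master, term, current):
    """Name the other systems a user could be sent to for this term."""
    return sorted(master.get(term, set()) - {current})

master = build_master(systems)
print(refer(master, "aphasia", "LINCS"))   # term absent from LINCS itself
print(refer(master, "phoneme", "LINCS"))   # term shared with another system
```

A user who brings "aphasia" to the sample LINCS list is referred to the
two systems that hold it, while a shared term such as "phoneme" marks a
point where a link from one system into the next could be made.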

APPENDIX

CORE TERMS IN THE LINCS COLLECTION OF LANGUAGE ORIENTED TERMS

The sources used in this appendix are those discussed in section 3,
Interfaces with LINCS:

dsh - dsh Abstracts of American Speech and Hearing
Association [3];

lrp - Language Research in Progress [11];

wru - Western Reserve Education Thesaurus [12];

nlm - National Library of Medicine [16];

ddc - Defense Documentation Center [19];

lex - Project LEX [20];

hrf - Human Relations Area Files [34];

jhu - Johns Hopkins (speech related) [35];

msh - Medical Subject Headings [56];

ere - ERIC Descriptors [57].

Terms are grouped by general areas of linguistics to which they are
related. An x at the intersection of a term and a source means that
that term is found in that source.
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